Applicants: Jonathan M. Rothberg, et al.

Assignee: CuraGen Corporation 

Serial Number: 09/814,338 Examiner: Young J. Kim

Filing Date: March 21, 2001 Art Unit: 1637 

FOR: METHOD OF SEQUENCING A NUCLEIC ACID 



Commissioner for Patents 
P. O. Box 1450 
Alexandria, VA 22313-1450 



DECLARATION OF MARCEL MARGULIES UNDER 37 C.F.R. § 1.132

I, MARCEL MARGULIES, declare and state that: 



1. I am Vice President of Engineering at 454® Life Sciences, the exclusive licensee
of this application. My previous employment includes Director of New
Technology Research at Perkin-Elmer's Instrument Division in Norwalk, CT, and
Associate Director of the Hubble Space Telescope project.

2. I earned my B.Sc. in Engineering from the Free University of Brussels, in 
Belgium, and a Ph.D. in theoretical physics from Columbia University. 

3. I have reviewed the instant application and the August 18, 2003 Office Action in 
this case. 



4. It is my opinion that the claimed invention represents the first massively parallel,
solid-phase, whole-genome sequencing platform, which is vastly superior to
previous sequencing technology for at least the reasons set out below.




5. Although DNA sequencing was performed by Gilbert and Sanger as early as 
1977, the apparati claimed in the instant application are the first to allow rapid 
massively parallel sequencing (e.g., of whole viral or bacterial genomes). 
Traditional methods for genome sequencing have been slow, expensive, 
laborious, and industrial-scale, since they involve individually preparing and 
sequencing samples (DNA fragments) of the genome. The Human Genome 
Project, for example, required approximately 12 years, $2.7 billion, and 60
million samples to complete. 

6. In contrast, the substrates and apparati claimed in the instant application provide a 
massively parallel, scalable platform that dramatically reduces the time, cost, 
sample preparation, and space required for genome sequencing. Instead of 
individually preparing and sequencing each sample, the claimed substrates and 
apparati allow parallel sequencing of thousands (or hundreds of thousands) of 
samples. 

7. Recently, the claimed substrates and apparati were used to sequence the entire 
adenovirus genome (approximately 30,000 base pairs) contained on an expression 
vector in less than one day (see NY Times article, Ex. 1). The entire sequencing 
process from sample preparation to data analysis was accomplished in less than 
one day, and provided over 99% genome coverage. The resulting adenovirus 
sequence was published in GenBank under Accession Nos. AY370909, 
AY370910, and AY370911 (Ex. 2). 

8. To generate this sequence information we fabricated preferred commercial
embodiments of the claimed substrates and apparati. In these preferred
embodiments, the claimed substrate is termed a "PicoTiter Plate". The PicoTiter
Plates used to generate the data referred to in Exs. 1, 2 and 3 were cavitated fiber
optic wafers formed from a fused bundle of a plurality of individual optical fibers
(as recited in the pending claims). Specifically, we fabricated PicoTiter Plates by
acid etching the top surface of fiber optic wafers to form wells with diameters
between 39 and 44 µm (as recited in the pending claims). The fiber optic wafer
has a thickness of about 2.0 mm (also as recited in the pending claims). In
addition, we fabricated the wells on PicoTiter Plates with depths ranging from 26
to 76 µm (i.e., from between one half the diameter of an individual optical fiber
and three times the diameter of an individual optical fiber, as recited in the
pending claims). Finally, we loaded the wells with nucleic acid template and
beads with pyrophosphate sequencing reagents attached thereto (as recited in the
pending claims). Sequencing by synthesis was then performed as described in the
specification, and using the claimed apparatus to flow sequencing reagents over
the PicoTiter Plate.

9. In further experiments, the apparatus of the instant application was used to
sequence a segment of human chromosome 12 (approximately 170,000 base pairs)
contained on an artificial chromosome vector (Ex. 3). With the apparati, a
one-day sequencing run produced sufficient shotgun sequence coverage of the
chromosome 12 clone (Ex. 3, p. 6). A single sequencing run obtained 85%
genome coverage and 98% consensus accuracy (Ex. 3, p. 3). These results were
presented at the 15th Annual Genome Sequencing and Analysis Conference, held
on September 21-24, 2003 (Ex. 3, p. 1).

10. The substrates and apparati claimed in the instant application therefore fulfill a 
long-felt but unmet need for rapid, whole-genome analysis of viral and bacterial 
pathogens (e.g., ¶ 7 above). Such analysis is critical for biodefense, drug
discovery, and the identification of emerging pathogens. More than this, the 
claimed apparati solve the long-standing problems with analysis of large 
genomes, such as in humans (e.g., ¶ 9 above). Solutions for large-genome
sequencing are vital for drug development, early diagnosis, and faster clinical 
interventions. 

11. For these reasons, in my opinion, the claimed substrates and apparati represent a
significant advancement in the field as the first massively parallel, solid-phase, 
whole-genome sequencing platform that can be scaled for viral, bacterial, and 
even human genomes. 

12. I further declare that all statements made herein of my own knowledge are true 
and that all statements made on information and belief are believed to be true; and 
further that these statements were made with the knowledge that willful false 
statements and the like so made are punishable by fine or imprisonment, or both, 
under 18 U.S.C. § 1001 and that willful false statements may jeopardize the
validity of this application and any patent issuing therefrom. 

LOCUS AY370909S1 2617 bp DNA SYN 21-AUG-2003

DEFINITION Expression vector pAdEasy-1, contig 1. 
ACCESSION AY370909 

VERSION AY370909.1 GI:34014917

KEYWORDS 

SEGMENT 1 of 3 

SOURCE Expression vector pAdEasy-1 

ORGANISM Expression vector pAdEasy-1 

artificial sequences; vectors. 

REFERENCE 1 (bases 1 to 2617) 

AUTHORS Sarkis,G., Costa,G., Leamon,J., Maithreyan,S., Berka,J., Du,L.,
Fierro,J., McDade,K., Puc,B., Roth,G.T., Gomes,X., Altman,W.,
Charumilind,J., Chen,Y.-J., Chen,Z., de Winter,A., Dewell,S.,
Drake,J., Forte,R., He,W., Helgesen,S., Jannotti,M.L., Jarvie,T.,
Jirage,K., Kelch,K., Kim,J.-B., Kukanski,K., Lanza,J., Lee,W.,
Lefkowitz,S., Lu,H., Makhijani,V., Margulies,M., Nobile,J.,
Norton,W., Reifler,M., Rodgers,G., Ronan,M., Simpson,J.,
Tartaro,K., Verma,S., Zimmerman,Z., Dacey,P., Begley,R. and
Lohman,K.

TITLE Sequence Analysis of the pAdEasy-1 Recombinant Adenoviral Construct
Using the 454 Life Sciences Sequencing-by-Synthesis Method
JOURNAL Unpublished
REFERENCE 2 (bases 1 to 2617) 
AUTHORS Lohman, K. 
TITLE Direct Submission 

JOURNAL Submitted (18-AUG-2003) 454 Life Sciences, 20 Commercial Street,
Branford, CT 06405, USA
FEATURES Location/Qualifiers
source 1..2617

/organism="Expression vector pAdEasy-1"
/mol_type="other DNA"
/db_xref="taxon:243021"

/clone="Stratagene catalog number 240005" 
/note="contig 1; differs from pAdEasy-1 sequence from 
Stratagene; sequenced by new method" 

BASE COUNT 576 a 748 c 711 g 582 t 

ORIGIN 

1 aattaacatg catggatcct acgtctcgac cgatgccctt gagagccttc aacccagtca 
61 gctccttccg gtgggcggcg gggcatgact atcgtcgccg cacttatgac tgtcttcttt 
121 atcatgcaac tcgtaggaca ggtgccggca gcgctctggg tcattttcgg cgaggaccgc 
181 tttcgctgga gcgcgacgat gatcggcctg tcgcttgcgg tattcggaat cttgcacgcc 
241 ctcgctcaag ccttcgtcac tggtcccgcc accaaacgtt tcggcgagaa gcaggccatt 
301 atcgccggca tggcggccga cgcgctgggc tacgtcttgc tggcgttcgc gacgcgaggc 
361 tggatggcct tccccattat gattcttctc gcttccggcg gcatcgggat gcccgcgttg 
421 caggccatgc tgtccaggca ggtagatgac gaccatcagg gacagcttca aggatcgctc 
481 gcggctctta ccagcctaac ttcgatcatt gttggaccgc tgatcgtcac ggcgatttat 
541 gccgcctcgg cgagcacatg gaacgggttg gcatggattg taggcgccgc cctatacctt 
601 gtctgcctcc ccgcgttgcg tcgcggtgca tggagccggg ccacctcgac ctgaatggaa

661 gccggcggca cctcgctaac ggattcacca ctccaagaat tggagccaat caattcttgc 

721 ggagaactgt gaatgcgcaa accaaccctt ggcagaacat atccatcgcg tccgccatct 

781 ccagcagccg cacgcggcgc atctcgggca gcgttgggtc ctggccacgg gtgcgcatga 

841 tcgtgctcct gtcgttgagg acccggctag gctggcgggg ttgccttact ggttagcaga 

901 atgaatcacc gatacgcgag cgaacgtgaa gcgactgctg ctgcaaaacg tctgcgacct 

961 gagcaacaac atgaatggtc ttcggtttcc gttgtttcgt aaagtctgga aacgcggaag 

1021 tcagcgccct gcaccattat gttccggatc tgcatcgcag gatgctgctg gctaccctgt 

1081 ggaacaccta catctgtatt aacgaagcgc tggcattgac cctgagtgat ttttcttctg 

1141 gtcccgccgc atccataccg ccagttgttt accctcacaa cgttccagta accgggcatg 

1201 ttcatcatca gtaacccgtc atcgtgagca tcctctctcg tttcatcggt atcattaccc 

1261 ccatgaacaa gaaatccccc ttacacggag gcatcagtga ccaaacaagg aaaaaaccag 

1321 cccttaacat ggcccgcttt atcagaagcc agacattaac gcttctggag aaactcaacg 

1381 agctggacgc ggatgaacag gcagacatct gtgaatcgct tcacgaccac gctgatgagc 

1441 tttaccgcag ctgcctcgcg cgtttcggtg atgacggtga aaacctctga cacatgcagc 

1501 tcccggagac ggtcacagct tgtctgtaag cggatgccgg gagcagacaa gcccgtcagg 

1561 gcgcgtcagc gggtgttggc gggtgtcggg gcgcagccat gacccagtca cgtagcgata 

1621 gcggcagtgt atactggctt aactatgcgg catcagagca gattgtactg agagtgcacc 

1681 atatgcggtg tgaaataccg cacagatgcg taaggaagaa aataaccgca tcaggcgctc 

1741 ttccgcttcc tcgctcactg actcgctgcg ctcggtcgtt cggctgcggc gagcggtatc 

1801 agctcactca aaggcggtaa tacggttatc cacagaatca ggggataacg caggaaagaa 

1861 catgtgagca aaaggccaag caaaaggcca aggaaccagt aaaaaggccg cgttgctggc 

1921 gtttttccat aggctccgcc cccctgacga gcatcacaaa atcgacgctc aagtcagagg 

1981 tggcgaaacc cgacaggact ataaagatac caggcgtttc ccctggaagc tccctcgtgc 

2041 gctctcctgt taccgaccct gccgcttacc ggatacctgt ccgcctttct tcccttcggg 

2101 aagcgtggcg ctttctcata agctcacgct gtaggtatct cagttcggtg taggtcgttc 

2161 gctccaagct gggctgtgtg cacgaacccc ccgttcagcc cgaccgctgc gccttatccg 

2221 gtaactatcg tcttgagtcc aacccggtaa gacacgactt atcgccactg gcagcagcca 

2281 ctggtaacag gattagcaga gcgaggtatg taggcggtgc tacagagttc ttgaagtggt 

2341 ggcctaacta cggctacact agaaggacag tatttggtat ctgcgctctg ctgaagccag 

2401 ttaccttcgg aaaagagttg gtagctcttg atccggcaaa caaaccaccg ctggtagcgg 

2461 tggttttttg tttgcaagca gcagattacg cgcaagaaaa aaggaatctc aagaagattc

2521 ctttgtattc ttttcttacg gggtgctgac gctcagtgga acgaaaactc acgttaaggg 
2581 attttggtca tgagattatc aaaaaggatc ttcacct 
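
The records in this exhibit can be cross-checked mechanically. The following is a minimal sketch in Python, not part of the GenBank submission: it tallies bases from ORIGIN lines and compares each BASE COUNT line against the length on the corresponding LOCUS line. The function names (`count_bases`, `line_end_position`, `base_count_matches`) and the `RECORDS` table are illustrative assumptions; the numbers in the table are copied from the three records as printed in this exhibit.

```python
from collections import Counter

# LOCUS lengths and BASE COUNT values as printed in the three records of this
# exhibit (keyed by accession). The table layout is a hypothetical convenience.
RECORDS = {
    "AY370909": (2617, {"a": 576, "c": 748, "g": 711, "t": 582}),
    "AY370910": (1062, {"a": 282, "c": 253, "g": 236, "t": 290, "others": 1}),
    "AY370911": (30091, {"a": 6985, "c": 8742, "g": 8241, "t": 6122, "others": 1}),
}


def count_bases(origin_lines):
    """Tally letters on ORIGIN lines, skipping position numbers and spaces."""
    counts = Counter()
    for line in origin_lines:
        counts.update(ch for ch in line.lower() if ch.isalpha())
    return counts


def line_end_position(line):
    """Return the position of the last base on one ORIGIN line.

    Digits are joined before parsing, so a position number split by a
    scanning artifact (for example "24 61") is still read as 2461.
    """
    start = int("".join(ch for ch in line if ch.isdigit()))
    n_bases = sum(1 for ch in line if ch.isalpha())
    return start + n_bases - 1


def base_count_matches(length, base_count):
    """True when the BASE COUNT entries add up to the LOCUS length."""
    return sum(base_count.values()) == length


# Final ORIGIN line of contig 1, as printed above.
LAST_LINE = "2581 attttggtca tgagattatc aaaaaggatc ttcacct"

if __name__ == "__main__":
    for accession, (length, base_count) in RECORDS.items():
        print(accession, length, base_count_matches(length, base_count))
    print("contig 1 ends at position", line_end_position(LAST_LINE))
```

A check of this kind only confirms that the printed counts and lengths are internally consistent; it says nothing about the accuracy of the sequence itself.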

LOCUS AY370909S2 1062 bp DNA
DEFINITION Expression vector pAdEasy-1, contig 2.
ACCESSION AY370910
VERSION AY370910.1 GI:34014918
KEYWORDS
SEGMENT 2 of 3
SOURCE Expression vector pAdEasy-1
ORGANISM Expression vector pAdEasy-1
artificial sequences; vectors.
REFERENCE 1 (bases 1 to 1062)
AUTHORS Sarkis,G., Costa,G., Leamon,J., Maithreyan,S., Berka,J., Du,L.,
Fierro,J., McDade,K., Puc,B., Roth,G.T., Gomes,X., Altman,W.,
Charumilind,J., Chen,Y.-J., Chen,Z., de Winter,A., Dewell,S.,
Drake,J., Forte,R., He,W., Helgesen,S., Jannotti,M.L., Jarvie,T.,
Jirage,K., Kelch,K., Kim,J.-B., Kukanski,K., Lanza,J., Lee,W.,
Lefkowitz,S., Lu,H., Makhijani,V., Margulies,M., Nobile,J.,
Norton,W., Reifler,M., Rodgers,G., Ronan,M., Simpson,J.,
Tartaro,K., Verma,S., Zimmerman,Z., Dacey,P., Begley,R. and
Lohman,K.
TITLE Sequence Analysis of the pAdEasy-1 Recombinant Adenoviral Construct
Using the 454 Life Sciences Sequencing-by-Synthesis Method
JOURNAL Unpublished
REFERENCE 2 (bases 1 to 1062)
AUTHORS Lohman,K.
TITLE Direct Submission
JOURNAL Submitted (18-AUG-2003) 454 Life Sciences, 20 Commercial Street,
Branford, CT 06405, USA
FEATURES Location/Qualifiers
source 1..1062
/organism="Expression vector pAdEasy-1"
/mol_type="other DNA"
/db_xref="taxon:243021"
/clone="Stratagene catalog number 240005"
/note="contig 2; differs from pAdEasy-1 sequence from
Stratagene; sequenced by new method"
BASE COUNT 282 a 253 c 236 g 290 t 1 others
ORIGIN
1 aaattaaaat gaatgtttta aatcaatcta aagtatatat gagtaaactt ggtctgacag
61 ttaccaatgc ttaatcagtg aggcacctat ctcagcgatc tgtcttattt cgttcatcca
121 tagttgcctg actccccgtc gtgtagataa ctacgatacg ggagggctta ccatctggcc
181 ccagtgctgc aatgataccg cgagacccac gctcaccggc tccagattta tcagcaataa
241 accagccagc cggaagggcc gagcgcagaa gtggtcctgc aactttatcc gcctccatcc
301 agtctattaa ttgttgccgg gaagctagag taagtagttc gccagttaat agtttgcgca
361 acgttgttgc cattgctgca ggcatcgtgg tgtcacgctc gtcgtttggt atggcttcat
421 tcagctccgg ttcccaacga tcaaggcgag ttacatgatc ccccatgttg tgcaaaaagc
481 ggttagctcc ttcggtcctc cgatcgttgt cagaagtaag ttggccgcag tgttatcact
541 catggttatg gcagcactgc ataattctct tactgtcatg ccatccgtaa gatgcttttc
601 ttgtgactgg tgagtactca accaagtcat tctgagaata gtgtatgcgg cgaccgagtt
661 gctcttgccc ggcgtcaaca cgggataata ccgcgccaca tagcaagaaa ctttaaaagt
721 tagctcatca ttaggaaaac gttcttcggg gcgaaaactc tcaaggatct taccgctgtt
781 gagatccagt tcgatgtaac ccactcgtgc acccaactga tcttcagcat cttttacttt
841 caccagcgtt tctgggtgag caaaaacagg aaggcaaaat gccgcaaaaa agggaataag
901 ggcgacacgg aaatgttgaa tactcatact cttccttttt caatattatt gaagcattta
961 tcagggttat tgtctcatga gcggatacat atttgaatgt atttagaaaa ataaacaaat
1021 aggggttccg cgcacatttc cccgaaagtg ccacctgtcn ag
//

LOCUS AY370909S3 30091 bp DNA linear
DEFINITION Expression vector pAdEasy-1, contig 3.
ACCESSION AY370911
VERSION AY370911.1 GI:34014919
KEYWORDS
SEGMENT 3 of 3
SOURCE Expression vector pAdEasy-1
ORGANISM Expression vector pAdEasy-1
artificial sequences; vectors.
REFERENCE 1 (bases 1 to 30091)
AUTHORS Sarkis,G., Costa,G., Leamon,J., Maithreyan,S., Berka,J., Du,L.,
Fierro,J., McDade,K., Puc,B., Roth,G.T., Gomes,X., Altman,W.,
Charumilind,J., Chen,Y.-J., Chen,Z., de Winter,A., Dewell,S.,
Drake,J., Forte,R., He,W., Helgesen,S., Jannotti,M.L., Jarvie,T.,

Kim,J.-B., Kukanski,K., Lanza,J., Lee,W.,
/organism="Expression vector pAdEasy-1"
/mol_type="other DNA"
/db_xref="taxon:243021"

/clone="Stratagene catalog number 240005" 
/note="contig 3; differs from pAdEasy-1 sequence from 
Stratagene; sequenced by new method" 

BASE COUNT 6985 a 8742 c 8241 g 6122 t 1 others 

ORIGIN 

1 caggtaggaa gaagtagtat aaggtggggt cttatgtagt tttgtatctt gttttgcagc 
61 agccgccgcc gccatgagca ccaactcgtt tgatggaagc attgtgagct catatttgac 
121 aacgcgcatg cccccatggg ccggggtggc gtcagaatgt gatgggctcc agcattgatg 
181 gtcgccccgt cctgcccgca aactctacta ccttgaccta cgagaccgtg tctggaacgc 
241 cgttggagac tgcagcctcc gccgccgctt cagccgctgc agccaccgcc cgcgggattg 
301 tgactgactt tgctttcctg agcccgcttg caagcagtca gcagcttccc gttcatccgc 
361 ccgcgatgac aagttgacgg ctcttttggc acaattggat tctttgaccc gggaacttaa 
421 tgtcgtttcg tcagcagctg ttggatctgc gccagcaggt ttctgccctg aaggcttcct 
481 cccctcccaa tgcggtttaa aacaataaaa taaaaaacca agactctgtt tggtatttgg
541 atcaagcaag tgtcttgctg tctttattta ggggttttgc gcgcgcggta ggcccgggac 
601 cagcggtctc ggtcgttgag ggtcctgtgt tatttttcca ggacgtggta aaggtgactc 
661 tggatgttca gatacatggg cataagcccg tctctggggt ggaggtagca ccactgcaga 
721 gcttcatgct ggcggggtgg tgttgtagat gatccagtcg tagcaggagc ggctgggcgt 
781 ggtgcctaaa aatgtctttc agtagcaagc tgattgccag gggcaggccc ttggtgtaag 
841 tgtttacaaa gcggttaagc tgggatgggt gcatacgtgg ggatatgaga tgcatcttgg 
901 actgttattt ttaggtttgg ctatgttccc agccatatcc ctccggggat tcatgttgtg 
961 cagaaccacc agcacagtgt atccggtgca cttgggaaat ttgtcatgta gcttagaagg 
1021 aaatgcgtgg aagaacttgg agacgccctt gtgacctcca agattttcca tgcattcgtc 
1081 cataatgatg gcaatgggcc cacgggcggc ggcctgggcg aagatatttc tgggatcact 
1141 aacgtcatag ttgttgttcc aggatgagat cgtcataggc catttttaca aagcgcgggc 
1201 gggagggtgc cagactgcgg tataatggtt ccatccggcc caggggcgta gttaccctca 
1261 cagatttgca tttcccacgc tttgagttca gatgggggga tcatgtctac ctgcggggcg 
1321 atgaagaaaa cggtttccgg ggtaggggag atcagctggg aagaaagcag gttcctgagc 
1381 agctgcgact taccgcagcc ggtgggcccg taaatcacac ctattaccgg cctgcaactg 
1441 gtagttaaga gagctgcagc tgccgtcatc cctgaggcag gggggccact tcgttaagca 
1501 tgtccctgac tcgcatgttt tccctgacca aatccgccag aaggcgctcg ccgcccagcg 
1561 atagcagttc ttgcaaggaa gcaaagtttt tcaacggttt gagaccgtcc gccgtaggca 
1621 tgcttttgag cgtttgacca agcagttcca ggcggtccca cagctcggtc acctgctcta 
1681 cggcatctcg atccagcata tctcctcgtt tcgcgggttg gggcggcttt cgctgtacgg 
1741 cagtagtcgg tgctcgtcca gacgggccag ggtcatgtct ttccacgggc ggcagggtcc 
1801 tcgtcagcgt agtctgggtc acggtgaagg ggtgcgctcc gggctgcgcg ctggccaggg 
1861 tgcgcttgag gctggtcctg ctggtgctga agcgctgccg gtcttcgccc tgcgcgtcgg 
1921 ccaggtagca tttgaccgat ggtgtcatag tccagcccct ccgcggcgtg gcccttggcg 
1981 cgcagcttgc ccttggagga ggcgccgcac gaggggcagt gcagactttt gagggcgtag 
2041 agcttgggcg cgagaaatac cgattccggg gaggtaggca tccgcgccgc caggccccgc 
2101 agacggtctc gcattccacg agccaggtga gctctggccg ttcggggtca aaaaccaggc 
2161 tttcccccat tgctttttga tgcgtttctt acctctggtt tccatgagcc ggtgtccacg 
2221 ctcggtgacg aaaaggctgt ccgtgtcccc gtatacagac ttgagaggcc tgtcctcgag 
2281 cggtgttccg cggtcctcct cgtatagaaa ctcggaccac tctgagacaa aggctcgcgt 
2341 ccaggccagc acgaaggagg ctaaggtggg aggggtaggc ggtcgttgtc cactaggggg 
2401 tccactcgct ccagggtgtg aagacacatg tcgccctctt cggcatcaag gaaggtgatt 
2461 ggtttgtagg tgtaggccac gtgaccgggt gttcctggaa ggggggctaa gtaaaagggg
2521 gtgggggcgg cgttcgtcct cactctcttc cgcatcgctg tctgcgaggg ccagctgttg 
2581 gggtgagtac tccctctgaa aagcgggcat gacttctgcg ctaagattgt cagtttacca 
2641 aaaacagagg aggatttgat attcacctgg cccgcggtga tgcctttgag ggtggccgca 
2701 tccatctggt caagaaaaga acaatctttt gttgtcaagc ttggtggcaa acgacccgta 
2761 gagggcgttg gacagcaact tggcgatgga gcggcagggt ttggtttttg ttcgcgatcg
2821 gcgcgctcct tggccgcgat gtttagctgc acgtattcgc gcgcaacgca ccgccattcg 
2881 ggaaagacgg tggtgcgctc gtcgggcacc aggtgcacgc gccaaccgcg gttgtgcagg 
2941 gtgacaaggt caacgctggt ggctacctct ccgcgtaggc gctcgttggt ccagcagagg 
3001 cggccgccct tgcgcgagca gaatggcggg tagggggtct agctgcgtct cgtgccgggg 
3061 ggtctgcgtc cacggtaaag accccgggca gcaggcgcgc gtcgaagtag tctatcttgc 
3121 atccttgcaa gtctagcgcc tgctgccatg cgcgggcggc aagcgcgcgc tcgtatgggt
3181 tgaggtgggg gaccccatgg catggggtgg gtgagcgcgg aggcgtacat gccgcaaatg
3241 tcgtaaacgt aggaggggct ctctgagtat tccaagatat gtagggtagc atcttccacc
3301 gcggatgctg gcgcgcacgt aatcgtatag ttcgtgcgag ggagcgagga ggtcgggacc
3361 gaggttgcta cgggcgggct gctctgctcg gaagactatc tgcctgaaga tggcatgtga
3421 gttggatgat atggttggac gctggaagac gttgaagctg gcgtctgtga gacctaccgc
3481 gtcacgcacg aaggaggcgt aggagtcgcg cagcttgttg accagctcgg cggtgacctg
3541 cacgtctagg gcgcagtagt ccagggtttc cttgatgatg tcatacttat cctgcttccc
3601 tttttttcca cagctcgcgg ttgaggacaa actcttcgcg gtctttccag tactcttgga
3661 tcggaaaccc gtcggcctcc gaacggtaag agcctagcat gtagaactgg ttgacggcct
3721 ggtaggcgca gcatcccttt tctacgggta gcgcgtatgc ctgcgcggcc ttccggagcg
3781 aggtgtgggt gagcgcaaag gtgtccctga ccatgacttt gaggtactgg tatttgaagt
3841 cagtgtcgtc gcatccgccc tgcctcccag aagcaaaagt ccgtgcgctt tttggaacgc
3901 ggatttggca gggcggaagg tgacatcgtt gaagagtatt ctttcccgcg cgaggcataa
3961 agttgcgtgt gatgcggaag ggtcccggca cctcggaacg gttgttaatt acctgggcgg
4021 cgagcacgat ctcgtcaaag ccgttgatgt tgtggcccac aatgtaaagt tccaagaagc
4081 gcgggatgcc cttgatggaa ggcaattttt aagtttcctc gtaggtgagc tcttcagggg
4141 agctgagccc gtgctctgaa agggcccagt ctgcaagatg agggttggaa gcgacgaatg
4201 agctccacag gtcacgggcc attagcattt gcaggtggtc gcgaaaggta cctaaactgg
4261 cgacctatgg cctatttttt cttggggtgg atgcagtaga aggtaagcgg gtcttgttcc
4321 cagcggtccc atccaaggtt cgcggctagg tctcgcgcgg cagtcactag aggctcatct
4381 ccgccgaact tcatgaccag catgaagggc acgagctgct tcccaaaggc ccccatccaa
4441 gtataggtct ctacatcgta ggtgacaaag agacgctcgg tgcgaggatg cgagccgatc
4501 gggaagaact ggatctcccg ccaccaattg gaggagtggc tattgatgtg gtgaaagtaa
4561 gaagtccctg cgacgggccg aacactcgtg cttggctttt gtaaaaacgt gcgcagtact
4621 ggcagcggtg cacgggctgt acatcctgca cgaggttgac ctgacgaccg cgcacaagga
4681 agcagagtgg gaatttgagc ccctccgcct ggcgggtttg gctggtggtc ttctacttcg
4741 gctgcttgtc cttgaccgtc tggctgctcg aggggagtta cggtggatcg gaccaccacg
4801 ccgcgcgagc ccaaagtcca gatgtccgcg cgcggcggtc ggagcttgat gacaacatcg
4861 cgcagatggg agctgtccat ggtctggagc tcccgcggcg tcaggtcagg cgggagctcc
4921 tgcaggttta cctcgcatag acgggtcagg gcggcgggct agatccaggt gatacctaat
4981 ttccaggggc tggttggtgg cggcgtcgat ggcttgcaag aggccgcatc cccgcggcgc
5041 gactacggta ccgcgcggcg ggcggtgggc cggcgggggt gtccttggat gatgcatcta
5101 aaagcggtga cgcgggcgag cccccggagg gtaggggggg ctccggaccc gccgggagga
5161 gggggcaggg gcacgtcggc gccgcgcgcg ggcaggagct ggtgctgcgc gcgtaggttg
5221 ctggcgaacg cgacgacgcg gcggttgatc tcctgaatct ggcgcctctg cgtgaagacg
5281 acgggcccgg tgagcttgaa aacctgaaag agagttcgac agaatcaatt tcggtgtcgt
5341 tgacggcggc ctggcgcaaa atctcctgca cgtctcctga gttgtcttga taggcgatct
5401 cggccatgaa ctgctcgatc tcttcctcct ggagatctcc gcgtccggct cgctccacgg
5461 tggcggcgag gtcgttggaa atgcgggcca tgagctgcga gaaggcgttg aggcctccct
5521 cgttccagac gcggctgtag accaccgccc ccttccggca tcgcgggcgc gcatgaccac
5581 ctgcgcgaga ttgagctcca cgtgccgggc gaagacggcg tagtttcgca ggcgctgaaa
5641 gaggtagttg agggtggtgg cggtgtgttc tgccacgaag aagtacataa cccagcgtcg
5701 caacgtggat tcgttgatat cccccaaggc ctcaaggcgc tccatggcct cgtagaagtc
5761 cacggcgaag ttgaaaaact gggagttgcg cgccgacacg gttaactcct cctccagaag
5821 acggatgagc tcggcgacag tgtcgcgcac ctcgcgctca aaggctacag gggcctcttc
5881 ttcttcttca atctcctctt ccataagggc ctccccttct tcttcttctg gcggcggtgg
5941 gggagggggg acacggcggc gacgacggcg caccgggagg cggtcgacaa agcgctcgat
6001 catctccccg ccggcgacgg cgcatggtct cggtgacggc gcggccgttc tcggcggggg
6061 cgcagttgga agacgccgcc cgtcatgtcc cggttatggg ttgggcgggg ggctgccatg
6121 cggcagggat acggcgctaa cgatgcatct caacaattgt tgtgtaggta ctccgccgcc
6181 gagggacctg agcgagtccg catcgaccgg atcggaaaac ctctcgagaa aggcgtctaa
6241 ccagtcacag tcgcaaggta ggctgagcac cgtggcgggc ggcaggcggg cggcggtcgg
6301 ggttgtttct ggcggaggtg ctgctgatga tgtaattaaa gtaggcggtc ttgagacggc
6361 ggatggtcga cagaagcacc atgtccttgg gtccggcctg ctgaatgcgc aggcggtcgg
6421 ccatgcccca ggcttcgttt tgacatcggc gcaggtcttt gtagtagtct tgcatgagcc
6481 tttctaccgg cacttcttct tctccttcct cttgtcctgc atctcttgca tctatcgctg
6541 cggcggcggc ggagtttggc cgtaggtggc gccctcttcc tcccatgcgt gtgaccccga
6601 agcccctcat cggctgaagc agggctaggt cggcgacaac gcgctcggct aatatggcct
6661 gctgcacctg cgtgagggta gactggaagt catccatgtc cacaaagcgg tggtatgcgc
6721 ccgtgttgat ggtgtaagtg cagttggcca taacggacca gttaacggtc tggtgacccg
6781 gctgcgagag ctcggtgtac ctgagacgcg agtaagccct cgagtcaaat acgtagtcgt 
6841 tgcaagtccg caccaggtac tggtatccca ccaaaagtgc ggcggcggct ggcggtagga 
6901 ggggccaggc gtagggtggg ccggggctgc cgggggcgga gatcttccaa cataaggcga 
6961 tgatatccgt agatgtacct ggacatccag gtgatgccgg cggcggtggt ggaggcgcgc 
7021 ggaaagtcgc ggacgcggtt ccagatgttg cgcagcaggc aaaagtagct ccatggtcgg 
7081 gacgctctgg ccggtcaggc gcgcgcaatc gttgacgctc tagcgtgcaa aaggagagcc 
7141 tgtaagcggg cactcttccg tggtctggtg gataaattcg caagggtatc atggcggacg 
7201 accggggttc gagccccgta tccggccgtc cgccgtgatc catgcggtta ccgcccgcgt 
7261 gtcgaaccca ggtgtgcgac gtcagacaag cgggggagtg cttccttttg gcttccttcc 
7321 aggcgcggcg gctgctgcgc ttagcttttt ggccactggc cgcgcgcagc gtaagcggtt 
7381 aggctggaaa gcgaaagcat taagtggctc gctccctgta gccggagggt tattttccaa 
7441 gggttgagtc gcgggacccc cggttcgagt ctcggaccgg ccggactgcg gcgaacgggg 
7501 gtttgcctcc ccgtcatgca agaccccgct tgcaaattcc tccggaaaca gggacgagcc 
7561 ccttttttgc cttttcccag atgcatccgg tgctgcggca gatgcgcccc cctccctcag 
7621 cagcggcaag agcaagagca gcggcagaca tgcagggcac cctcccctcc ctcctaccgc 
7681 gtcaggaggg gcgacatccg cggttgacgc ggcagcagat ggtgattacg aacccccgcc 

7741 ggcgccgggc ccggcactac ctggacttgg aggagggcgg agggcctggc gcggctagga
7801 gcgccctctc ctgagcggcc cacccaaggg atgcagctga agcgtgatac gcgtgaggcg 

7861 tacgtgccgc ggcagaacct gtttcgcgac cgcgagggag aggagcccga ggagatgcgg
7921 gatcgaaagt tccacgcagg gcgcgagctg cggcatggcc tgaatcgcga gcggttgctg 
7981 cgcgaggagg actttgagcc cgacgcgcga accgggatta gtcccgcgcg cgcacacgtg 
8041 gcggccgccg acctggtaac cgcatacgag cagacggtga accaggagat taactttaca 
8101 aaaagcattt aaacaaccac gtgcgtacgc ttgtggcgcg cgaggaggtg gctataggac 
8161 tgatgcatct gtgggacttt gtaagcgcgc tggagcaaaa cccaaataag caagccgctc 
8221 atggcgcagc tgttccttat agtgcagcac agcagggaca acgaggcatt cagggatgcg 
8281 ctgctaaaca tagtagagcc cgagggccgc tggctgctcg atttgataaa catcctgcag 
8341 agcatagtgg tgcaggagcg cagcttgagc ctggctgaca aggtggccgc catcaactat 
8401 tccatgctta gcctgggcaa gttttacgcc cgcaagatat accatacccc ttacgttccc
8461 atagacaagg aggtaaagat cgaggggttc tacatgcgca tggcgctgaa tggtgcttac
8521 cttgagcgac gacctgggcg tttatcgcaa cgagcgcatc cacaaggccg tgagcgtgag 
8581 ccggcggcgc gagctcagcg accgcgagct gatgcacagc ctgcaaaggg ccctggctgg 
8641 cacgggcagc ggcgatagag aggccgagtc ctactttgac gcgggcgctg acctgcgctg 
8701 ggccccaagc cgacgcgccc tggaggcagc tggggccgga cctgggctgg cggtggcacc 
8761 cgcgcgcgct ggcaacgtcg gcggcgtgga ggaatatgac gaggacgatg agtacgagcc 
8821 agaggacggc gagtactaag cggtgatgtt tctgatcaga tgatgcaaga cgcaacggac 
8881 ccggcggtgc gggcggcgct gcagagccag ccgtccggcc ttaactccac ggacgactgg 
8941 cgccaggtca tggaccgcat catgtcgctg actgcgcgca atcctgacgc gttccggcag 
9001 cagccgcagg ccaaccggct ctccgcaatt ctggaagcgg tggtcccggc gcgcgcaaac 
9061 cccacgcacg agaaggtgct ggcgatcgta aacgcgctgg ccgaaaacag ggccatccgg 
9121 cccgacgagg ccggcctggt ctacgacgcg ctgcttcagc gcgtggctcg ttacaacagc 
9181 ggcaacgtgc agaccaacct ggaccggctg gtgggggatg tgcgcgaggc cgtggcgcag 
9241 cgtgagcgcg cgcagcagca gggcaacctg ggctccatgg ttgcactaaa cgccttcctg 
9301 agtacacagc ccgccaacgt gccgcgggga caggaggact acaccaactt tgtgagcgca 
9361 ctgcggctaa tggtgactga gacaccgcaa agtgaggtgt accagtctgg gccagactat 
9421 ttttccagac cagtagacaa ggcctgcaga ccgtaaacct gagccaggct ttcaaaaact 
9481 tgcaggggct ggtggggggt ggcgggctcc cacaggcgac cgcgcgaccg tgtctagctt 
9541 gctgacgccc aactcgcgcc tgttgctgct gctaatagcg cccttcacgg acagtggcag 
9601 cgtgtcccgg gacacatacc taggtcactt gctgacactg taccgcgagg ccataggtca 
9661 ggcgcatgtg gacgagcata ctttccagga gattacaagt gtcagccgcg cgctggggca 
9721 ggaggacacg ggcagcctgg aggcaaccct aaactacctg ctgaccaacc ggcggcagaa 
9781 gatcccctcg ttgcacagtt taaacagcga ggaggagcgc attttgcgct acgtgcagca 
9841 gagcgtgagc cttaacctga tgcgcgacgg ggtaacgccc agcgtggcgc tggacatgac 
9901 cgcgcgcaac atggaaccgg gcatgtatgc ctcaaaccgg ccgtttatca accgcctaat 
9961 ggactacttg catcgcgcgg ccgccgtgaa ccccgagtat ttcaccaatg ccatcttgaa 

10021 cccgcactgg ctaccgcccc ctggtttcta caccggggga ttcgaggtgc ccgagggtaa 
10081 cgatggattc ctctgggacg acatagacga cagcgtgttt tccccgcaac cgcagaccct 
10141 gctagagttg caacagcgcg agcaggcaga ggcggcgctg cgaaaggaaa gacttccgca 
10201 ggccaagcag cttgtccgat ctaggcgctg cggccccgcg gtcagatgct agtagcccat 
10261 ttccaagctt gatagggtct cttaccagca ctcgcaccac ccgcccgcgc ctgctgggcg 
10321 aggaggagta cctaaacaac tcgctgctgc agccgcagcg cgaaaaaacc tagcctccgg
10381 catttcccaa ccaacgggat agagagccta gtggacaaga tgagtagatg gaagacgtac
10441 gcgcaggagc acagggacgt gccaggcccg ccgcccgccc acccgtcgtc aaaggcacga
10501 ccgtcagcgg ggtctggtgt gggaggacga tgactcggca gacgacagca gcgtcctgga
10561 tttgggaggg agtggcaacc cgtttgcgca ccttcgcccc aggcctgggg aggaatagtt
10621 ttaaaaaaaa aaaaatagat gcaaaataaa aaactacacc aaggccatgg caccgagcgt
10681 tggttttctt gtattcccct tagtatgcgg cgcgcggcga tgtatgagga aggtcctcct
10741 ccctcctacg agagtgtggt gagcgcggcg ccagtggcgg cggcggctgg gttctccctt
10801 cgatgctccc ctggacccgc cgtttgtgcc tccgcggtac ctgcggccta ccggggggag
10861 aaacagcatc cgttactctg agttggcacc cctattcgac accacccgtg tgtacctggt
10921 ggacaacaag tcaacggatg tggcatccct gaactaccag aacgaccaca gcaactttct
10981 gaccacggtc attcaaaaca atgactacag cccgggggag gcaagcacac agaccatcaa
11041 tcttgacgac cggtcgcaca tggggcggcg acctgaaaac catcctgcat accaacatgc
11101 caaatgtgaa cgagttcatg tttaccaata agtttaaggc ggcgggtgat ggtgtcgcgc
11161 ttgcctacta aggacaatca ggtggagctg aaatacgagt gggtggagtt cacgctgccc
11221 gagggcaact actccgagac catgaccata gaccttatga acaacgcgat cgtggagcac
11281 tacttgaaag tgggcagaca gaacggggtt gctggaaagc gacatcgggg taaaggtttg
11341 acacccgcaa cttcagactg gggtttgacc ccgtcactgg tcttgtcatg cctggggtat
11401 atacaaacga agccttccat ccagacatca ttttgctgcc aggatgcggg gtggacttca
11461 cccacagccg cctgagcaac ttgttgggca tccgcaagcg gcaacccttc caggagggct
11521 ttaggatcac ctacgatgat ctggagggtg gtaacattcc cgcactgttg gatgtggacg
11581 cctaccaggc gagcttgaaa gatgacaccg aacagggcgg gggtggcgca ggcggcagca
11641 acagcagtgg cagcggcgcg gaagagaact ccaacgcggc agccgcggca atgcagccgg
11701 tggaggacat gaacgatcat gccattcgcg gcgacacctt tgccacacgg gctgaggaga
11761 agcgcgctga ggccgaagca gcggccgaag ctgccgcccc cgcctgcgca acccgaggtc
11821 gagaagcctc agaagaaacc ggtgatcaaa cccctgacag aggacagcaa gaaacgcagt
11881 tacaacctaa taagcaatga cagcaccttc acccagtacc gcagctggta ccttgcatac
11941 aactacggcg accctcagac cggaatccgc tcatggaccc tgctttgcac tcctgacgta
12001 acctgcggct cggagcaggt ctactggtcg ttgccagaca tgatgcaaga ccccgtgacc
12061 ttccgctcca cgcgccagat cagcaacttt ccggtggtgg gcgccgagct gttgcccgtg
12121 cactccaaga gcttctacaa cgaccaggcc gtctactccc aactcatccg ccagtttacc
12181 tctctgaccc acgtgttcaa tcgctttccc gagaaccaga ttttggcgcg cccgcccagc
12241 ccccaccatc accaccgtca gtgaaaacgt tcctgctctc acagatcacg ggacgctacc
12301 gctgcgcaac agcatcggag gagtccagcg agtgaccatt actgacgcca gacgccgcac
12361 ctgcccctac gtttacaagg ccctgggcat agtctcgccg cgcgtcctat cgagccgcac
12421 ttttgagcaa gcatgtccat ccttatatcg cccagcaata acacaggctg gggcctgcgc
12481 ttcccaagca agatgtttgg cggggccaag aagcgctccg accaacaccc agtgcgcgtg
12541 cgcgggcact accgcgcgcc ctggggcgcg cacaaacgcg gccgcactgg gcgcaccacc
12601 gtcgatgacg ccatcgacgc ggtggtggag gaggcgcgca aactacacgce caccgccgcc
12661 accagtgtcc acagtggacg cggccattca gaccgtggtg cgcggagccc ggcgctatgc
12721 taaaatgaag agacggcgga ggcgcgtagc acgtcgccac cgccgccgac ccggcactgc
12781 cgcccaacgc gcggcggcgg ccctgcttaa ccgcgcacgt cgcaccggcc gacgggcggc
12841 catgcgggcc gctcgaaggc tggccgcggg tattgtcact gtgcccccca ggtccaggcg
12901 acgagcggcc gccgcagcag ccgcggccat tagtgctatg actcagggtc gcaggggcaa
12961 cgtgtattgg gtgcgcgact cggttagcgg cctgcgcgtg cccgtgcgca ccccgccccc
13021 cgccgcaact agattgcaag aaaaaactaa cttagactcg tactgttgta tgtatccagc
13081 ggcggcggcg cgcaacgaag ctatgtccaa gcgcaaaatc aaagaagaga tgctccaggt
13141 catcgcgccg gagatctatg gccccccgaa gaaggaagag caggattaca agccccgaaa
13201 gctaaagcgg gtcaaaaaga aaaagaaaag aatgatgatg atgaacttga cgacgaggtg
13261 gaactgctgc acgctaccgc gcccaggcga cgggtacagt ggaaaggtcg acgcgtaaaa
13321 cgtgttttgc gacccggcac caccgtagtc tttacgcccg gtgagcgctc cacccgccac
13381 ctacaagcgc gtgtatgatg aggtgtacgg cgacgaggac ctgcttgagc acggccaacg
13441 agcgcctcgg ggagtttgcc tacggaaagc ggcataagga catgctggcg ttgccgctgg
13501 acgagggcaa cccaacacct agcctaaagc ccgtaacact gcagcaggtg ctgcccgcgc
13561 ttgcaccgtc cgaagaaaag cgcggcctaa agcgcgagtc tggtgacttg gcacccaccg
13621 tgcagctgat ggtacccaag cgccagcgac tggaagatgt cttggaaaaa atgaccgtgg
13681 aacctgggct ggagcccgag gtccgcgtgc ggccaatcaa gcaggtggcg ccgggactgg
13741 gcgtgcagac cgtggacgtt cagataccca ctaccagtag caccagtatt gccaccgcca
13801 cagagggcat ggagacacaa acgtccccgg ttgcctcagc ggtggcggat gccgcggtgc
13861 aggcggtcgc tgcggccgcg tccaagacct ctacggaggt gcaaacggac ccgtggatgt
13921 ttcgcgtttc cagccccccg gccgcccgcg
13981 gctactgccc gaatatgccc tacatccttc 
14041 cacctaccgc cccagaagac gagcaactac 
14101 ccgccgtcgc cgtcgccagc ccgtgctggc 
14161 aggaggcagg accctggtgc tgccaacagc 
14221 ggtctttgtg gttcttgcag atatggccct 
14281 attccgagga agaatgcacc gtaggagggg 
14341 gcgtcgtgcg caccaccggc ggcggcgcgc 
14401 gcccctcctt attccactga tcgccgcggc 
144 61 ggccttgcag gcgcagagac actgaattaa 
14521 aaaaagtact ggactctcac gctcgcttgg 
14581 tcaactttgc gtctctggcc ccgccgacac 
14641 gatatcggca ccagcaatat gagcggtggc 
14701 aattaaaaat ttcggttcca ccgttaagaa 
14761 aggccagatg ctgagggata agttgaaaga 
14821 gcctggcctc tggcattagc ggggtgggtg 
14881 attaacagta agcttgatcc ccgccctccc 
14 941 gtgtctccag aggggcggtg gcgaaaagcg 
15001 tgacgcaaat agacgagcct ccctcgtacg 
15061 cccgtcccat cgcgcccatg gctaccggag 
15121 acctgccctc cccccgcccg acacccagca 
15181 tgttgtaacc cgtcctagcc gcgcgtccct 
15241 gcggcccgta gccagtggca actggcaaag 
15301 tgcaatccct gaagcgccga cgatgcttct 
15361 atgcgtccat gtcgccgcca gaggagctgc 
15421 gctacccctt cgatgatgcc gcagtggtct 
15481 gagtacctga gccccgggct ggtgcagttt 
15541 aataacaagt ttaagaaacc ccacggtggc 
15601 ccagcgtttg acgctgcggt tcatccctgt 
15661 ggcgcggttc accctagctg tgggtgataa 
15721 tgacatccgc ggcgtgctgg acaggggccc 
15781 acaacgccct ggcctcccaa gggtgcccca 
15841 gctcttgaaa taaacctaag aagaagagga 
15901 agctgagcaa gcaaaaacta cacgtatttg 
15961 caaaggaggg tattcaaata ggtgtcgaag 
16021 atttcaacct gaacctcaaa taaggagaat 
16081 tgcagctggg aggagtacct aaaaaagcac 
16141 tgcaaaaccc acaaatgaaa atggagggca 
16201 gctagaaagt aaagtggaaa tgaatttttt 
16261 gcaatggtga ttaacttact acctaagtgg 
16321 accccagaca ctcatatttc ttacatgccc 
16381 atgggccaac aatctatgcc caacaggcct 
16441 ggtctaatgt attacaacag cacgggtaat 
16501 ttgaatgctg ttgtagattt gcaagacaag 
16561 ttgattccat tggtgataga accaggtact 
16621 atgatccaga tgttagaatt attgaaatac 
16681 ctgctttcca ctgggaggtg tgattaatac 
16741 aggtcaggaa aataggatgg gaaaagaata 
16801 aataagagtt ggaaataatt ttgccatgga 
16861 tttcctgtac tccaacatag cgctgtattt 
16921 cagtaaaaat ttctgataac ccaaacaacc 
16981 cccgggcctc agtggactgc tacattaacc 
17041 acaacgtcaa cccatttaac caccaccgca 
17101 tgggcaatgg tcgctatgtg cccttccaca 
17161 aaaacctcct tctcctgccg ggctcataca 
17221 acatggttct gcagagctcc ctaggaaatg 
17281 ttgatagcat ttgcctttac gccaccttcc 
17341 cttgaggcca tgcttagaaa cgacaccaac 
17401 gccaacatgt ctctacccta tacccgccaa 
17461 ccgcaactgg gcggctttcc gcggctgggc 



ccgttcgagg aagtacggcg ccgccagcgc 
cattgcgcct acccccggct atcgtggcta 
ccgacgccga accaccactg gaacccgccg 
cccgatttcc gtgcgcaggg tggctcgcga 
gcgctaccac cccagcatcg tttaaaagcc 
cacctgccgc ctccgtttcc cggtgccggg 
catggccggc cacggcctga cgggcggcat 
gtcgcaccgt cgcatgcgcg gcggtatcct 
gattggcgcc gtgcccggaa ttgcatccgt 
aaacaagttg catgtggaaa ataacaaaat 
tcctgtaact attttgtaga atggaagaca 
ggctcgcgcc cgttcatggg aaactggcaa 
gccttcagct ggggctcgct gtggagcggc 
ctatggcagc aaggcctgga acagcagcac 
agcaaaattt ccaacaaaag gtggtagatg 
gacctggcca accaggcagt gcaaaataag 
gtagaggagc ctccaccggc cgtggagaca 
tccgccgccc cgacagcjgaa gaaactctgg 
aggaggcact aaagcaaggc ctgcccacca 
tgctgggcca gcacacaccc gtaacgctgg 
gaaacctgtg ctgccaggcc cgaccgccgt 
gcgccgcgcc gccagcggtc cgcgatcgtt 
cacactgaac agcatcgtgg gtgctggggg 
gatagctaac gtgtcgtatg tgtgtcatgt 
tgagccgccg cgcgcccgct ttccaagatg 
tacatgcaca tctcgggcca ggacgcctcg 
gcccgcgcca ccgagacgta cttcagcctg 
gcctacgcac gacgtgacca cagaccggtc 
ggaccgtgag gatactgcgt actcgtacaa 
ccgtgtgctg gacatggctt ccacgtactt 
tacttttaag cccttactct ggcactgcct 
aatccttgcg aatgggatga agctgctact 
cgatgacaac gaagacgaag tagacgagca 
ggcaggcgcc ttattctggt ataaatatta 
gtcaaacaac ctaaatatgc cgataaaaca 
ctcagtggta cgaaacaaga aattaaatca 
taccccaatg aaaccatgtt acggttcata 
aggcattctt gtaaagcaac aaaatggaaa 
ctaactaact agtaggcagg ccagccgcag 
tattgtacag tgaagatgta gatataagaa 
actattaagg aaggtaactc acgagaacta 
aattacattg cttttaggga caattttatt 
atgggtgttc tggcgggcca agcatcgcag 
aaacaacaga gctttcatac cagcttttgc 
tttctatgtg gaatcaggct gttgacagct 
atggaactag aagatgaact taccaaatta 
agagactctt accaaggtaa aacctaaaac 
gctacagaat tttacttaga taaaataaga 
aatcaatcta aatgccaacc tgtggagaaa 
gcccgacaag ctaaagtaca gtccttccaa 
tacgactaca tgaacaagcg agtggtggct 
ttggagcacg ctggtccctt gactatatgg 
atgctggcct gcgctaccgc tcaatgttgc 
tccaggtgcc tcagaagttc tttgccatta 
cctacgagtg gaacttcagg aaggatgtta 
acctaagggt tgacggagcc agcattaagt 
ttccccatgg cccacaacac cgcctccacg 
gaccagtcct ttaacgacta tctctccgcc 
cgctaccaac gtgcccatat ccatcccctc 
cttcacgcgc cttaagacta aggaaacccc 
17521 atcactgggc tcgggctacg acccttatta cacctactct ggctctatac cctacctaga
17581 tggaaccttt tacctcaacc acacctttaa gaaggtggcc attacctttg actcttctgt
17641 cagctggcct ggcaatgacc gcctgcctta cccccaaccg agtttgaaat taagcgctca
17701 gttgacgggg agggttacaa cgttgcccag tgtaacatga ccaaagactg gttcctggta
17761 caaatgctag ctaactatta acattggcta ccagggcttc tatatcccag agagctacaa
17821 ggaccgcatg tactccttct ttagaaactt accagcccat gagccgtcag gtggtggatg
17881 atactaaata acaaggacta ccaacaggtg ggcatcctac accaacaaca acaactctgg
17941 atttgttggc taccttgccc ccaccatgcg cgaaggacag gcctaccctg ctaacttccc
18001 ctatccgctt ataggcaaga ccgcagttga cagcattacc caagaaaagt ttactttgcg
18061 atcgcaccct ttggcgcatc ccattctcca gtaactttat gtccatgggc gcactcacag
18121 acctgggcca aaaccttctc tacgccaact ccgcccacgc gctagacatg actttgaggt
18181 ggatcccatg gacgagccca cccttcttta ttgttttgtt tgaagtcttt gacgtggtcc
18241 gtgtgacacc aagccgcacc gcggcgtcat cgaaaccgtg tacctgcgca cgcccttctc
18301 ggccggcaac gccacaacat aaagaagcaa gcaacatcaa caacagctgc cgccatgggc
18361 tccagtgagc aggaactgaa agccattgtc aaagatcttg gttggtgggc cattattttt
18421 gggctaccta tgacaagcgc tttcctaggc tttgtttctc cacacaagct cgcctgcgcc
18481 atagtcaata cggccggtcg cgagactggg ggcgtacact ggatggcctt tgcctggaac
18541 ccgcactcaa aaacatgcta cctctttgag ccctttggct tttctgacca gcgactcaag
18601 caggtttacc agtttgagta cgagtcactc ctgcgccgta gcgccattgc ttccttcccc
18661 cgaccgctgt ataacgctgg aaaagtccac ccaaagcgta caggggccca actcggccgc
18721 ctgtggacta ttctgctgca tgtttctcca cgcctttgcc aacctggccc caaacctccc
18781 atggatcaca accccaccat gaaccttatt accggggtac ccaactccat gctcaacagt
18841 ccccaggtac agcccaccct gcgtcgcaac caggaacagc tctacagctt cctggagcgc
18901 cactcgccct acttccgcag ccacagtgcg cagattagga gcgccacttc tttttgttca
18961 cttgaaaaac aatagtaaaa ataatgtact agagacactt tcaataaagg caaattgctt
19021 ttatttgtta cactctcggg tgattattta cccccaccct tgccgtctgc gccgtttaaa
19081 aatcaaaggg gttctgccgc gcatcgctat gcgccactgg cagggacacg ttgcgatact
19141 ggtgtttagt gctccactta aactcaggca caaccatccg cggcagctcg gtgaagtttt
19201 cactccacag gctgcgcacc atcaccaacg cgtttagcag gtcgggcgcc gatatcttga
19261 agtcgcagtt ggggcctccg ccctgcgcgc gcgagttgcg atacacaggg ttgcagcact
19321 ggaacactat cagcgccggg tggtgcacgc tggccagcac gctcttgtcg gagatcagat
19381 ccgcgtccag gtcctccgcg ttgctcaggg cgaacggagt caactttggt agctgccttc
19441 ccaaaaaggg cgcgtgccca ggctttgagt tgcactcgca ccgtagtggc atcaaaaggt
19501 gaccgtgccc ggtctgggcg ttaggataca gcgcctgcat aaaagccttg atctgcttaa
19561 aagccacctg agcctttgcg ccttcagaga agaacatgcc gcaagacttg ccggaaaact
19621 gattggccgg acaggccgcg tcgtgcacgc agcaccttgc gtcggtgttg gagatctgca
19681 ccacatttcg gccccaccgg ttcttcacga tcttggcctt gctagactgc tccttcagcg
19741 cgcgctgccc gttttcgctc gtcacatcca tttcaatcac gtgctcctta tttatcataa
19801 tgcttccgtg tagacactta agctcgcctt cgatctcagc gcagcggtgc agccacaacg
19861 cgcagcccgt gggctcgtga tgcttgtagg tcacctctgc aaacgactgc aggtacgcct
19921 gcaggaatcg ccccatcatc gtcacaaagg tcttgttgct ggtgaaggtc agctgcaacc
19981 cgcggtgctc ctcgttcagc caggtcttgc atacggccgc cagagcttcc acttggtcag
20041 gcagtagttt gaagttcgcc tttagatcgt tatccacgtg gtacttgtcc atcagcgcgc
20101 gcgcagcctc catgcccttc ctcccacgca gacacgatcg gcacactcag cgggttcatc
20161 accgtaattt cactttccgc ttcgctgggc tcttcctctt cctcttgcgt ccgcatacca
20221 cgcgccactg ggtcgtcttc attcagccgc cgcactgtgc gcttacctcc tttgccatgc
20281 ttgattagca ccggtgggtt gctgaaaccc accatttgta gcgccacatc ttcttctttc
20341 ttcctcgctg tccacgatta cctctggtga tggcgggcgc tcgggcttgg gaggaagggc
20401 gcttcttttt ctttcttggg cgcaatggcc aaatccgccg ccgaggtcga tggccgcggg
20461 ctgggtgtgc gcggcaccag cgcgtcttgt gatgagtctt cctcgtcctc ggactcgata
20521 cgccgcctca tcctgctttt tgggggcggc ccggggaggg cggcggcgac ggggagcggg
20581 gacgacacgt cctccatggt tgggggagcg tcgcgccgca ccgcgtccgc gctcgggggt
20641 ggtttcgcgc tgctcctctt cccgactggc catttccttc tcctataggc agaaaaagaa
20701 tcatggagtc agtcgagaag aaggacagcc taacccgccc cctcctgagt tcgccaccac
20761 cgcctccacc gatgccgcca acgcgcctac cacccttccc cgtcgaggca cccccgcttg
20821 aggaggagga agtgattatc gagcaggacc caggttttgt taagcgaaga cgacgaggac
20881 cgctcagtac caacagagga ataaaaagca aagaaccagg acaacgcaga ggcaaacgag
20941 gaacaagtcg ggcgggggga gcgaaaggca tggcgactac ctagatgtgg gagacgacgt
21001 gctgttgaag catctgcagc gccagtgcgc cattatctgc gacgcgttgc aagagcgcag
21061 cgatgtgccc ctccgccata gcggatgtca gccttgccta cgaacgccac ctattctcac
21121 cgcgcgtacc ccccaaaccg ccaagaaaac ggcacatgcg agcccaaccc gccgcctcaa
21181 cttcctaccc cgtatttgcc gtgccagagg tgcttgccac ctatcacatc tttttccaaa
21241 actgcaagat acccctatcc tgccgtgcca accgcagccg agcggacaag cagctggcct
21301 tgcggcaggg cgctgtcata cctgatatcg cctcgctcaa cgaagtgcca aaaatctttg
21361 agggtgcttg gacgcgacga gaagcgcgcg gcaaacgctc tgcaacaagg aaaacaagcg
21421 aaatagaaag tcactctgga gtgttggtgg aactcgaggg tgacaacgcg cgcctagccg
21481 tactaaaacg cagcatcgag gtcacccact ttgcctaccc ggcacttaac ctacccccca
21541 aggtcatgag cacagtcatg agtgagctga tcgtgcgccg tgcgcagccc ctggagaggg
21601 atgcaaattt gcaagaaaca aacaagagga gggcctaccc gcagttggcg acgagcagct
21661 agcgcgctgg cttcaaacgc gcgagcctgc cgacttggag gagcgacgca aactaatgat
21721 ggccgcagtg ctcgttaccg tggagcttga gtgcatgcag cggttctttg ctgacccgga
21781 gatgcagcgc aagctagagg aaacattgca ctacaccttt cgacagggct acgtacgcca
21841 ggcctgcaag atctccaacg tggagctctg caacctggtc tcctaccttg gaattttgca
21901 cgaaaaccgc cttgggcaaa acgtgcttca ttccacgctc aagggcggag gcgcgccgcg
21961 actacgtccg cgactgcgtt tacttatttc tatgctacac ctggcagacg gccatgggcg
22021 tttggcagca gtgcttggag gagtgcaacc tcaaggagct gcagaaactg ctaaagcaaa
22081 acttgaagga cctatggacg gccttcaacg agcgctccgt ggccgcgcac ctggcggaca
22141 tcattttccc cgaacgcctg cttaaaaccc tgcaacaggg tctgccagac ttcaccagtc
22201 aaagcatgtt gcagaacttt aggaacttta tcctagagcg ctcaggaatc ttgcccgcca
22261 cctgctgtgc acttcctagc gactttgtgc ccattaagta ccgcgaatgc cctccgccgc
22321 tttggggcca ctgctacctt ctgcagctag ccaactacct tgcctaccac tctgacataa
22381 tggaagacgt gagcggtgac ggtctactgg agtgtcactg tcgctgcaac ctatgcaccc
22441 cgccaccgct ccctggtttg caattcgcag ctgcttaacg aaagtcaaat tatcggtacc
22501 tttgagctgc agggtccctc cgcctgacga aaagtccgcg gctccggggt tgaaactcac
22561 tccggggctg tggacgtcgg cttaccttcg caaatttgta cctgaggact accacgccca
22621 cgagattagg ttctacgaag accaatcccg cccgcctaat gcggagctta ccgcctgcgt
22681 cattacccag ggccacattc ttggccaatt gcaagccatc aacaaagccc gccaagagtt
22741 tctgctacga aagggacggg gggtttactt ggacccccag tccggcgagg agctcaaccc
22801 aatccccccg cccgccgcag ccctatcagc agcagccgcg ggcccttgcc ttcccaggat
22861 ggcacccaaa aagaagctgc agctgccgcc gccacccacc ggacgaggag gaatactggg
22921 acagtcaggc agaggaggtt ttggacgagg aggaggagga catgatggaa gactgggaga
22981 gcctagacga ggaagcttcc gaggtcgaag aggtgtcaga cgaaacaccg tcaccctcgg
23041 tcgcattccc ctccgccggc cgccccagaa atcggcaacc ggttccagca tggctacaac
23101 ctccgctcct caggcgccgc cggcactgcc cgttcgccga cccaaccgta gatgggacac
23161 cactggaacc agggccggta agtccaagca gccgccgccg ttagcccaag agcaacaaca
23221 gcgccaaggc taccgctcat ggcggcgggc acaagaacgc catagttgct tgcttgcaag
23281 actgtggggg caacatctcc ttccgcccgc cgctttcttc tctaccatca cggcgtggcc
23341 ttcccccgta acatcctgca ttactaccgt catctctaca gcccatactg caccggcggc
23401 cggcggcagc aaagaacagc aacagcagcg gccacacaga agcaaaggcg accggatagc
23461 aagactctga caaagcccaa gaaatccaca gcggcggcag cagcaggagg aggagcgctg
23521 cgtctggcgc ccaacgaacc cgtatcgacc cgcgagctta gaaacaggat tttcccactc
23581 tgtatgtcta tatttcaaca gagcaggggc caagaacaag agctgaaaat aaaaaacaag
23641 gtctctgcga tccctccacc cgcagctgcc tgtatcacaa aagcgaagat cagcttcggc
23701 gcacgctgga agacgcggag gctctcttca gtaaatactg cgcgctgact cttaaggact
23761 agtttcgcgc cctttctcaa atttaagcgc gaaaactacg tcatctccag cggccacacc
23821 cggccgccag ccacctgctt gttgtcagcg ccattatgag caaggaaatt cccaccgccc
23881 tacatgtgga gttaccagcc acaaatggga cttgcggctg gagctgccca agactactca
23941 acccgaataa actacatgag cgcgggaccc cacatgatat cccgggtcaa cggaataacg
24001 cgcccaccga aaccgaattc tcctggaaca ggcggctatt accaccacac ctcgtaataa
24061 ccttaatccc cgtagttggc ccgcctgccc tggtgtacca ggaaagtccc gcctcccacc
24121 actgtggtac ttcccagaga cgcccaggcc gaagttcaga tgactaactc aggggcggca
24181 gcttgcgggc ggctttcgtc acagggtggc ggtcgcccgg gcagggtata actcacctga
24241 caatcagagg gcggaggata ttcagactca acgacgagtc ggtgagctcc tcgcttggtc
24301 tccgtccgga cgggacattt cagatcggcg gcgccggccg cctcttcatt cacgcctcgt
24361 caggcaatcc taactctgca gacctcgtcc tctgagccgc gctctggagg cattggaact
24421 ctgcaattta ttgaggagtt tgtgccatcg gtctacttta accccttcct cgggacctcc
24481 cggccactat ccggatcaat ttattcctaa ctttgacgcg gtaaaggact cggcggacgg
24541 ctacgactga atgttaagtg gagaggcaga gcaactgcgc ctgaaacacc tggtccactg
24601 tcgccgccac aagtgctttg cccgcgactc cggtgagttt tgcttacttt gaattgcccg
24661 aggatcatat cgagggcccg gcgcacggcg tccggcttac cgcccaggga gagcttgccc
24721 gtagcctgat tcgggagttt acccagccgc cccctgctag ttgagcggga caggggaccc
24781 tgtgttctca ctgtgatttg caactgtcct aaccctggat tacatcaaga tcctctagtt
24841 aattcagtac ccggggactt acttacctta accctttaac taaataaaaa aataaataaa
24901 agcatcactt acttaaaatc agttagcaaa tttctgtcca gtttattcag cagcacctcc
24961 cttgccctcc ctcccagctc tggtattgca gcttcctcct ggctgcaaac tttctccaca
25021 atactaaata ggaatgtcag tttcctcctg ttcctgtcca tccgcaccca ctatcttcat
25081 gttgttgcag atgaagcgcg caagaccgtc tgaagatacc ttccaacccc gtgtatccat
25141 atgacacgga aaccggatcc tccaactgtt gccttttctt actcctccct ttgtatcccc
25201 caatgggttt caagagagtc cccctggggt actctctttg cgcctatccg aacctctagt
25261 tacctccaat ggcatgcttg cgctcaaaat gggcaacggc ctctctctgg acgaggccgg
25321 caaccttacc tcccaaaatg taaccactgt gagcccacct ctacaaaaaa ccaaagtcaa
25381 acaataaacc tggaaatatc tgcacccctc acagttacct cagaagccct aactgtggct
25441 gccgccgcac ctctaatggt cgcgggcaac acactcacca tgcaatcaca ggccccgcct
25501 aaccgtgcac gactccaaac ttaagcattg ccacccaagg cacccctcca cagtgtcaga
25561 aggaaagcta gccctgcaaa catcaggccc cctccaccac caccgatagc agtaccctta
25621 ctatcactgc ctccaccccc tcctaactac tgccactggt agcttgggca ttgacttgaa
25681 agagcccatt tatacaacaa aataggaaaa ctaggactaa agtacggggc tcctttgcat
25741 gtaacagacg acctaaacac tttgaccgta gcaactggtc caggtgtgac tattaataat
25801 acttccttgc aaactaaagt tactggagcc ttgggttttg attcacaagg caatatgcaa
25861 cttaatgtag caggaggact aaggattgat tctcaaaaca gacgccttat acttgatgtt
25921 agttatccgt ttgatgctca aaaccaacta aatctaagac taggacaggg cccttctttt
25981 tattaaactc agcccacaac ttggatatta actaacaaca aaggccttta cttgtttaca
26041 gcttcaaaca attccaaaag cttgaggtta acctaagcac tgccaagggg ttgatgtttg
26101 acgctacagc catagccatt aatgcaggag atgggcttga atttggttca cctaatgcaa
26161 ccaaacaaca aactccccta ccaaaacaaa aattaggcca tggcctagaa tttgattcaa
26221 acaaggctat ggttcctaaa ctaggaactg gccttagttt tgacagcaca ggtgccatta
26281 cagtaaggaa acaaaataat gataagctaa ctttgtggac cacaccagct ccatctccta
26341 actgtagaac taaatgcaga gaaagatgct aaactcactt tggtcttaac aaaatgtggc
26401 agtcaaatac ttgctacagt ttctagtttt ggctgttaaa ggcagtttgg ctccaatatc
26461 tggaacagtt acaagtgctc atcttattat aagatttgac agaaatggag tgctactaaa
26521 caattccttc ctggacccag aatattggaa ctttagaaat ggagatctta ctgaaggcac
26581 agcctataca aacgctgttg gatttatgcc taacctatca gcttatacca aaatctcaca
26641 aggtaaaact aagccaaaag taaactattg tcagtctaag tttaacttaa acggagaaca
26701 aaactaaacc tgtaacacta accattacac taaacggtac acaaggaaac aggagacaca
26761 actccaagtg catactctat gtctattttc tatgggactg gtctggccac aactaattaa
26821 tgaaatattt gccacatcct cttacacttt tcatacattg cccaagaata aagaatcgtt
26881 tgttgttatg tttcaacgtg tttatttttc taattgcaga aaatttcaag tcatttttct
26941 attcagtagt atagccccac caccacatag cttatacaga tcaccgtacc ttaatcaaac
27001 tcacagaacc ctagtattca acctgccacc tccctcccaa cacacagagt acacagtcct
27061 ttcctccccg gcctggcctt aaaaagcaat catatcatgg gtaacagaca tattcttagg
27121 tgttatattc cacacggttt ccttgtcgag ccaaacgctc atcagtgata ttaataaact
27181 ccccgggcca gctcacttaa gttcatgtcg ctgtccagct gctgagccac aggctgctgt
27241 ccaacttgcg gttgcttaac gggcggcgaa ggagaagtcc acgcctacat gggggttagg
27301 agtcataatc gtgcatcagg atagggcggt ggtgctgcag cagcgcgcga ataaactgct
27361 gccgccgccg ctccgtcctg caggaataca acatggcagt ggtctcctca gcgatgattc
27421 gcaccgcccg cagcataagg cgccttgtcc tccgggcaca gcagcgcacc ctgatctcac
27481 ttaaatcagc acagtaactg cagcacagca ccacaatatt gttcaaaata cccacagtgc
27541 aaggcgctgt atccaaagct catggcgggg accacagaac ccacgtggcc atcataccac
27601 aagcgcaggt agattaagtg gcgacccctc ataaacacgc tggacataaa cattaccttc
27661 ttttggcatg ttgtaattca ccacctcccg gtaccatata aacctctgat taaacatggc
27721 gccatccacc accatcctaa accagcctag gccaaacctg cccgccggcc ctanactaca
27781 ctgcagggaa ccgggactgg aacaatgaca gtggagagcc caggactcgt aaccatggat
27841 catcatgctc gtcatgatat caatgttggc acaacacagg cacacgtgca tacacttcct
27901 caggattaca agctcctccc gcgttagaac catatcccag ggaacaaccc attcctgaat
27961 cagcgtaaat cccacactgc agggaagacc tcgcacgtaa ctcacgttgt gcattgtcaa
28021 agtgttacat tcgggcagca gcggatgatc ctccagtatg gtagcgcggg tttctgtctc
28081 aaaaggaggt agacgatccc tactgtacgg agtgcgccga gacaaccgag atcgtgttgg
28141 tcgtagtcgt catgccaaat ggaacgccgg acgtagtcat atttcctgaa gcaaaaccaa
28201 ggtgcgggcg tgacaaacag atctgcgtct ccggtctcgc cgcttagatc gctctgtgta
28261 gtagttgtag tatatccact ctctcaaagc atccaggccg ccccctggct tcgggttcta
28321 tgtaaactcc ttcatgcgcc gctgccctga taacatccac caccgcagaa taagccacac
28381 ccagccaacc tacacattcg ttctgcgagt cacacacggg aggagcggga agaggctgga
28441 agaagccatg ttttttttat ttccaaaagt aattactaac caaaacctaa caaaatgaag
28501 atctattaag tgaacgcgcc tcccctccgg tggcgtggtc aaactctaca agccaaagaa
28561 cagataatgg catttgtaag atgttgcaca atggcttcca aaaggcaaac ggccctcacg
28621 tccaagtgga cgtaaaggct aaacccttca gggtgaatct cctctataaa cattccagca
28681 ccttcaacca tgcccaaata attctcatct cgccaccttc tcaatatatc tctaagcaaa
28741 tacccgaata ttaagtccgg ccattaagtt aaaatactcc agagcgccct ccaccttcag
28801 cctcaagcag cgaatcatga ttgcaaaaat tcaggttcct cacagacctg tataagattc
28861 aaaagcggaa caattaaaca aaaataaccg cgatcccgta ggtcccttcg cagggccagc
28921 tgaacataat cgtgcaggtc tgcacggacc agcgcggcca ccttccccgc caggaaccat
28981 gacaaaagaa cccacactga ttatgacacg catactcgga gctatgctaa ccagcgtagc
29041 cccgatgtaa gcttgttgca tgggcggcga taataaaatg caaggtgctg ctcaaaaata
29101 caggcaaagc ctcgcagcaa aaagaaagca acatcgtagt catgctcatg cagataaagg
29161 caggtaagct ccggaaccac cacaagaaaa gaacaccatt ttctctcaaa catgtctgcg
29221 ggtttctgca ataaacaaca aaaataaaaa taaaacaaaa aaacaattta aacattagaa
29281 gcctgtctta caacaaggaa aaacaaccct tataagcata agacggacta cggccatgcc
29341 ggcgtgaccg taaaaaacta ggtcaccgtg attaaaaagc aaccaccgac agctcctcgg
29401 tcatgtccgg agtcataatg taagactcgg taaacacatc aggttgattc acatcggtca
29461 gtgctaaaaa gcgaccgaaa tactaggccc gggggaatac atacccgcag gcgtagagac
29521 aacattacac gcccccatag gaggtataaa caaaattaat aggagaagaa aaacaacaat
29581 aaacacctga aaaaccctcc tgcctaggca aaatagcacc ctcccgctcc agaacaacat
29641 acagcgcttc cacagcggca gccataacag tcagccttac caagtaaaaa agaaaaccat
29701 aattaaaaaa caaccaactc gacacggcac cagctcaatc agtcacagtg taaaaaaggg
29761 ccaagtgcag agcgagtata tataggaact aaaaaatgac gtaacggtta aagtccacaa
29821 aaacaaccca gaaaaccgca cgcgaaccta cgcccagaaa cgaaagccaa aaaacccaac
29881 aacttcctca aatcgtcact tcctgctttc ccacgttacg tcacttccca ttttaagaaa
29941 actaacaatt cccaacacat acaagttact ccgccctaaa acctacgtca cccgccccgc
30001 ttcccaccgc cccgccgcca cgtcacaaac tccaccccct ccattatcat attggcttca
30061 ataccaaaat aaggtaatat tatttatgat
//
454 Life Sciences has developed proprietary methods for massively
parallel DNA sequencing. We have applied this technology to
resequencing and mapping human BAC clones to their precise
chromosomal locations. These preliminary data show the efficacy of the
technology in rapidly sampling and characterizing subsets of sequence
spanning an entire genome or a specific chromosomal location. The
novel DNA sequencing method consists of three steps: template
preparation, solid-phase amplification, and solid-phase DNA sequencing.
Several thousand to several hundred thousand DNA sequencing
reactions are performed simultaneously on glass plates containing 300
thousand to 1 million wells, each 75 picoliters in volume. The average
read length of each fragment is consistently greater than 50 bases.
Genome sequencing starts from a single template preparation and
requires no bacterial plasmid cloning step, which greatly reduces costs
and increases the throughput of our system. In addition, we are
completing development of a new software algorithm for de novo whole
genome assembly. Sequencing results from human BAC clones will be
presented and discussed.



Our novel methodology requires only a single sample preparation per genome and uses simultaneous
clonal amplification of shotgun fragments in sub-nanoliter microreactors, without time-consuming
cloning steps. The product of each microreactor is driven to and captured by a
concomitant solid support. The captured DNAs are delivered to wells on the PicoTiterPlate™ and
sequenced on 454 Life Sciences' sequencing platform. The details of these steps are illustrated in
Figure 1.
Figure 1. Streamlined template preparation and amplification process



1) BAC DNA from clone RP11-418C2 was fragmented to sub-kilobase lengths.

2) The fragment ends were polished, 5' and 3' adaptors were ligated onto each fragment, and the
sample was size fractionated, resulting in products under 500 bases in length.

3) One strand of these double-stranded products was bound to microparticles, and the free strand 
was eluted as template for the subsequent amplification reaction. 

4) Amplification was conducted in a single reaction preparation, encapsulating the reaction 
reagent mix, a single DNA capture bead, and template in a 40 to 100 picoliter microreactor. 

5) The particular template molecule contained in each individual microreactor was amplified and 
immobilized on the respective DNA capture bead. 

6) The DNA capture beads were extracted and the template DNA was prepared for use on the 
454 sequencer. 



The 454 sequencer generates raw traces for each microreactor and produces sequence reads in
FASTA format using a proprietary basecaller program. Adaptors and low-quality reads are removed,
and repeats are masked, before mapping and assembly.

Human Genome Mapping: 

Each masked read was mapped against the human genome (NCBI build 33) using BLAT, and the
mapped reads (>95% identity) were recorded for each chromosome.
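
The per-chromosome tally described above can be sketched as follows. This is a minimal illustration, not the software used in the study: it assumes each read has already been reduced to one best hit, and the `(read_id, chromosome, percent_identity)` tuple layout and the function name `tally_by_chromosome` are hypothetical, not BLAT's own output format.

```python
from collections import Counter

def tally_by_chromosome(hits, min_identity=95.0):
    """Count reads per chromosome, keeping hits above min_identity percent.

    hits: iterable of (read_id, chromosome, percent_identity) tuples
    (a hypothetical layout; one best hit per read is assumed).
    """
    counts = Counter()
    for read_id, chromosome, identity in hits:
        if identity > min_identity:  # the text's ">95% identity" cutoff
            counts[chromosome] += 1
    return counts

# Toy data, invented for illustration only.
example_hits = [
    ("read1", "chr12", 99.0),
    ("read2", "chr12", 96.5),
    ("read3", "chr6", 97.0),
    ("read4", "chr12", 90.0),  # dropped: below the identity cutoff
]
per_chromosome = tally_by_chromosome(example_hits)
print(per_chromosome.most_common())
```

A bar chart of such counts per chromosome is the kind of summary shown in Fig. 2a.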

BAC Assembly: 

Each sequence was mapped against the reference BAC sequence (RP11-418C2) using a proprietary 
alignment algorithm and the resulting alignment was recorded. For sequences that map to the 
genome with >90% accuracy, the software generates a list of individual bases found at a given 
position in the reference genome. The consensus base for each location was computed by averaging 
all mapped bases. This consensus sequence was then compared with the reference sequence to 
calculate total accuracy and coverage. 
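
The consensus step can be illustrated with a short sketch. The alignment algorithm itself is proprietary and not described here, so the code below makes several assumptions: alignments are ungapped, each is given as a hypothetical `(start, bases)` pair with a 0-based start on the reference, and "averaging all mapped bases" is interpreted as taking the most frequent base at each position.

```python
from collections import Counter

def consensus(reference_length, alignments):
    """Build a consensus string; '-' marks positions with no mapped base.

    alignments: list of (start, bases), 0-based start, ungapped
    (an assumed, simplified representation).
    """
    piles = [Counter() for _ in range(reference_length)]
    for start, bases in alignments:
        for offset, base in enumerate(bases):
            position = start + offset
            if 0 <= position < reference_length:
                piles[position][base] += 1
    # Most frequent base per position (majority vote is an assumption).
    return "".join(p.most_common(1)[0][0] if p else "-" for p in piles)

def coverage_and_accuracy(reference, cons):
    """Fraction of reference positions covered, and fraction of covered
    positions where the consensus agrees with the reference."""
    covered = [(r, c) for r, c in zip(reference, cons) if c != "-"]
    coverage = len(covered) / len(reference)
    accuracy = sum(r == c for r, c in covered) / len(covered) if covered else 0.0
    return coverage, accuracy

# Toy example, invented for illustration only.
toy_reference = "acgtacgtac"
toy_alignments = [(0, "acgt"), (2, "gtac"), (2, "gaac"), (4, "acg")]
toy_consensus = consensus(len(toy_reference), toy_alignments)
toy_coverage, toy_accuracy = coverage_and_accuracy(toy_reference, toy_consensus)
```

In the toy example the single miscalled base in the third read is outvoted by the other two reads covering that position, which is the effect the consensus step relies on.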

We also mixed a 3x oversample of reads (950 sequences) generated by the conventional Sanger method
with reads generated by the 454 sequencer and assembled the combined set with Phrap using default parameters.



Human Genome Mapping: 

Of 8561 mapped reads, 7153 map to human chromosome 12 (Fig. 2a). Of these, 7058 reads
map to the expected location within chromosome 12 (Fig. 2b). The coordinate boundaries for clone
RP11-418C2 in NCBI build 33 are 11,818,492-11,986,440, whereas the boundaries of the 7058-read stack
are 11,816,616-11,986,511. We also mapped these reads to the mouse genome and located the BAC in
the syntenic region on mouse chromosome 6 (data not shown).
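
The arithmetic behind these mapping figures can be checked with a few lines. The numbers are those reported above; the variable names are illustrative, and treating the coordinate ranges as inclusive is an assumption.

```python
# Read counts reported in the text.
mapped_reads = 8561
chr12_reads = 7153
on_target_reads = 7058

# Coordinates on chromosome 12 (NCBI build 33) reported in the text.
clone_start, clone_end = 11_818_492, 11_986_440
stack_start, stack_end = 11_816_616, 11_986_511

fraction_chr12 = chr12_reads / mapped_reads          # about 0.836
fraction_on_target = on_target_reads / mapped_reads  # about 0.824

# How far the read stack extends beyond the annotated clone boundaries.
upstream_overhang = clone_start - stack_start    # 1876 bases
downstream_overhang = stack_end - clone_end      # 71 bases

# Clone length, assuming inclusive coordinates.
clone_span = clone_end - clone_start + 1         # 167949 bases

print(f"{fraction_chr12:.1%} of mapped reads fall on chromosome 12")
print(f"{fraction_on_target:.1%} of mapped reads fall at the expected locus")
```

So roughly 82% of all mapped reads land at the expected locus, and the read stack overshoots the annotated clone boundaries by under 2 kb on one side and under 100 bases on the other.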
(a) Mapping against Human Genome (b) Mapping to Human Chr12

Figure 2. Sequence mapping against the human genome and within chromosome 12



BAC Assembly: 

In a separate sequencing run, we generated 67193 raw reads from this BAC clone. After 
adaptor removal, repeat masking and quality trimming, 39900 reads were assembled 
against the reference sequence (Fig. 3). Genome coverage is 85% and consensus 
accuracy is 98%. Average read length is 84 bases. 
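
As a rough check on these figures, the read retention rate and the nominal sequencing depth can be estimated from the reported numbers. This is a back-of-the-envelope sketch: the clone length is taken from the build 33 coordinates quoted earlier (assumed inclusive), and the depth estimate assumes every assembled read contributes the average read length, which ignores trimming and uneven coverage.

```python
raw_reads = 67193
assembled_reads = 39900
average_read_length = 84  # bases

# Clone length from the reported build 33 coordinates (inclusive, assumed).
clone_length = 11_986_440 - 11_818_492 + 1  # 167949 bases

retained_fraction = assembled_reads / raw_reads          # about 0.594
assembled_bases = assembled_reads * average_read_length  # 3351600 bases
nominal_depth = assembled_bases / clone_length           # about 20x

print(f"retained {retained_fraction:.1%} of raw reads")
print(f"nominal depth of about {nominal_depth:.1f}x")
```

A nominal depth near 20x alongside 85% coverage is consistent with reads being distributed unevenly across the BAC, as the frequency plot in Fig. 3 is meant to show.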
Figure 3. Frequency of assembled reads across BAC sequence length



Phrap Assembly: 

Sanger reads alone generated 25 major contigs (>2 kb) with a 76% mapping
efficiency, whereas Sanger and 454 reads combined produced 18 major contigs with
an 83% mapping efficiency. The 454 reads were able to join and extend Sanger contigs
into much larger stretches (Fig. 4).
We have demonstrated in this study that 454 Life Sciences' novel
sequencing methodology is capable of producing sufficient shotgun
sequence coverage of a BAC clone in a single run (completed within 1 day).
The reads can be used to map the clone's precise location in the genome, as well
as to assemble contigs against a reference sequence. This makes the method a
useful tool for whole genome mapping and sequencing.

We also showed that combining the conventional Sanger method with
454 technology gives a better de novo assembly outcome for
whole genome shotgun sequencing.

We are continuing to develop our quality scoring and trimming algorithms.
We have completed phase one of our proprietary fragment assembler, 
designed to take advantage of the raw trace signals produced by our 
sequencing-by-synthesis method. This assembler will be available as 
part of 454's commercial sequencing instrument. 



